We study fragmentation trees of Gibbs type. In the binary case, we identify the most general
Gibbs-type fragmentation tree with Aldous' beta-splitting model, which has an extended
parameter range $\beta > -2$ with respect to the beta$(\beta+1, \beta+1)$ probability distributions on which
it is based. In the multifurcating case, we show that Gibbs fragmentation trees are associated
with the two-parameter Poisson-Dirichlet models for exchangeable random partitions of $\mathbb{N}$, with
an extended parameter range $0 \le \alpha \le 1$, $\theta \ge -2\alpha$ and $\alpha < 0$, $\theta = -m\alpha$, $m \in \mathbb{N}$.

Keywords: Aldous' beta-splitting model; Gibbs distribution; Markov branching model; 
Poisson-Dirichlet distribution 

1. Introduction 

We are interested in various models for random trees associated with processes of
recursive partitioning of a finite or infinite set, known as fragmentation processes [2, 4, 9].
We start by introducing a convenient formalism for the kind of combinatorial trees
arising naturally in this context [16, 18]. Let $\#B$ be the number of elements in the finite
non-empty set $B$. Following standard terminology, a partition of $B$ is a collection

$$\pi_B = \{B_1, \ldots, B_k\}$$

of non-empty disjoint subsets of B whose union is B. To introduce a new terminology 
convenient for our purpose, we make the following recursive definition. A fragmentation 
of $B$ (sometimes called a hierarchy or a total partition) is a collection $t_B$ of non-empty
subsets of $B$ such that

(i) $B \in t_B$;

(ii) if $\#B \ge 2$, then there is a partition $\pi_B$ of $B$ into $k$ parts, $B_1, \ldots, B_k$, called the
children of $B$, for some $k \ge 2$, with

$$t_B = \{B\} \cup t_{B_1} \cup \cdots \cup t_{B_k}, \tag{1}$$

where $t_{B_i}$ is a fragmentation of $B_i$ for each $1 \le i \le k$.

Figure 1. Two fragmentations of [9] graphically represented as trees labeled by subsets of [9].

Necessarily, $B_i \in t_B$, each child $B_i$ of $B$ with $\#B_i \ge 2$ has further children, and so on,
until the set $B$ is broken down into singletons. We use the same notation $t_B$ both

• for such a collection of subsets of B, and 

• for the tree whose vertices are these subsets of B and whose edges are defined by 
the parent/child relation determined by the fragmentation. 

To emphasize the tree structure, we may call $t_B$ a fragmentation tree. Thus, $B$ is the root
of $t_B$ and each singleton subset of $B$ is a leaf of $t_B$ (see Figure 1 - here $[9] = \{1, \ldots, 9\}$;
we also put $[n] = \{1, \ldots, n\}$). We denote by $\mathbb{T}_B$ the collection of all fragmentations of $B$.
A fragmentation $t_B \in \mathbb{T}_B$ is called binary if every $A \in t_B$ has either 0 or 2 children. We
denote by $\mathbb{B}_B \subseteq \mathbb{T}_B$ the collection of binary fragmentations of $B$.

For each non-empty subset $A$ of $B$, the restriction to $A$ of $t_B$, denoted $t_{A,B}$, is the
fragmentation tree whose root is $A$, whose leaves are the singleton subsets of $A$ and
whose tree structure is defined by restriction of $t_B$. That is, $t_{A,B}$ is the fragmentation
$\{C \cap A : C \cap A \ne \emptyset,\ C \in t_B\} \in \mathbb{T}_A$, corresponding to a reduced subtree, as discussed by
Aldous [1].

Given a rooted combinatorial tree with no single-child vertices and whose leaves are 
labeled by a finite set $B$, there is a corresponding fragmentation $t_B$, where each vertex
of the combinatorial tree is associated with the set of leaves in the subtree above that 
vertex. So the fragmentations defined here provide a convenient way to label the vertices 
of a combinatorial tree and to encode the tree structure in the labeling. 
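
As an illustration of this encoding, the following is a minimal sketch that stores a fragmentation as a Python set of frozensets. The helper names `children`, `is_fragmentation` and `restrict` are hypothetical and chosen only for this example; `restrict` implements the restriction $\{C \cap A : C \cap A \ne \emptyset\}$ described above.

```python
def children(t, B):
    """Children of B in t: the maximal proper subsets of B that belong to t."""
    proper = [A for A in t if A < B]
    return [A for A in proper if not any(A < C for C in proper)]

def is_fragmentation(t, B):
    """Check the recursive definition: B is in t and t = {B} u t_{B_1} u ... u t_{B_k}."""
    B = frozenset(B)
    t = set(t)
    if B not in t:
        return False
    if len(B) == 1:
        return t == {B}
    kids = children(t, B)
    if len(kids) < 2 or frozenset().union(*kids) != B:
        return False
    if sum(len(c) for c in kids) != len(B):      # children must be disjoint
        return False
    parts = [{A for A in t if A <= c} for c in kids]
    if set().union({B}, *parts) != t:            # no stray subsets allowed
        return False
    return all(is_fragmentation(p, c) for p, c in zip(parts, kids))

def restrict(t, A):
    """Restriction of the fragmentation t to the non-empty subset A."""
    A = frozenset(A)
    return {C & A for C in t if C & A}
```

Restricting a tree on $\{1,2,3,4\}$ with root split $\{1,2\},\{3,4\}$ to $\{1,2,3\}$ collapses the vertex $\{3,4\}$ to the leaf $\{3\}$, so single-child vertices never appear.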

A random fragmentation model is an assignment, for each finite subset $B$ of $\mathbb{N}$, of a
probability distribution on $\mathbb{T}_B$ for a random fragmentation $T_B$ of $B$. We assume
throughout this paper that the model is exchangeable, meaning that the distribution of $T_B$ is
invariant under the obvious action of permutations of $B$ on fragmentations of $B$. The
distribution of $\Pi_B$, the partition of $B$ generated by the branching of $T_B$ at its root, is
then of the form

$$P(\Pi_B = \{B_1, \ldots, B_k\}) = p(\#B_1, \ldots, \#B_k) \tag{2}$$

for all partitions $\{B_1, \ldots, B_k\}$ with $k \ge 2$ blocks and some symmetric function $p$ of
compositions of positive integers, called a splitting probability rule. The model is called

• consistent if for every $A \subset B$, the restricted tree $T_{A,B}$ is distributed like $T_A$;

• Markovian if, given $\Pi_B = \{B_1, \ldots, B_k\}$, the $k$ restricted trees $T_{B_1,B}, \ldots, T_{B_k,B}$ are
independent and distributed as $T_{B_1}, \ldots, T_{B_k}$;

• binary if $T_B$ is a binary tree with probability one, for every $B$.

Aldous [2] initiated the study of consistent Markovian binary trees as models for neutral 
evolutionary trees. He observed parallels between these models and Kingman's theory 
of exchangeable random partitions of N, and posed the problem of characterizing these 
models analogously to known characterizations of the Ewens sampling formula for random
partitions. In [9], we showed how consistent Markovian trees arise naturally in Bertoin's 
theory of homogeneous fragmentation processes [4] and deduced from Bertoin's theory a 
general integral representation for the splitting rule of a Markovian fragmentation model. 

To briefly review these developments in the binary case, the distribution of a Markovian
binary fragmentation $T_B$ is determined by a splitting rule $p$, which is a symmetric function
$p$ of pairs of positive integers $(i, j)$, according to the following formula for the probability
of a given tree $t \in \mathbb{B}_B$:

$$P(T_B = t) = \prod_{A \in t:\, \#A \ge 2} p(\#A_1, \#A_2), \tag{3}$$

where $A_1$ and $A_2$ denote the two children of $A$ in the tree $t$.
The following proposition collects some known results.

Proposition 1. (i) Every non-negative symmetric function $p$ subject to normalization
conditions

$$\sum_{k=1}^{n-1} \binom{n-1}{k-1}\, p(k, n-k) = 1 \quad \text{for all } n \ge 2$$

defines a Markovian binary fragmentation model.

(ii) A splitting rule $p$ gives rise to a consistent Markovian binary fragmentation if
and only if

$$p(i,j) = p(i+1,j) + p(i,j+1) + p(i+j,1)\,p(i,j) \quad \text{for all } i,j \ge 1. \tag{4}$$

(iii) Every consistent splitting rule admits an integral representation

$$p(i,j) = \frac{1}{Z(i+j)} \left( \int_{(0,1)} x^i (1-x)^j\, \nu(dx) + c\,1_{\{i=1 \text{ or } j=1\}} \right) \quad \text{for all } i,j \ge 1, \tag{5}$$

with characteristics $c \ge 0$ and $\nu$ a symmetric measure on $(0, 1)$ with
$\int_{(0,1)} x(1-x)\,\nu(dx) < \infty$, and $Z(n)$ a sequence of normalization constants.



Proof. (i) is elementary. For (ii), Ford [6], Proposition 41, gave a characterization of
consistency for models of unlabeled trees which is easily shown to be equivalent to the
condition stated here. The interpretation (and sketch of proof) of this condition is that
for $B = C \cup \{k\}$ (with $k \notin C$), the vertex $C$ of $T_C$ splits into a particular partition of
sizes $i$ and $j$ if and only if $T_B$ splits into that partition with $k$ added to one or the other
block, or if $T_B$ first splits into $C$ and $\{k\}$ and then $C$ splits further into that partition
of sizes $i$ and $j$. (iii) is directly read from [9]. □

Aldous [2] studied in some detail the beta-splitting model which arises as the particular
case of (5) with characteristics $c = 0$ and

$$\nu(dx) = x^\beta (1-x)^\beta\, dx \ \text{ for } \beta \in (-2, \infty) \quad \text{and} \quad \nu(dx) = \delta_{1/2}(dx) \ \text{ for } \beta = \infty. \tag{6}$$

Aldous posed the problem of characterizing this model among all consistent binary 
Markov models. The main focus of this paper is the following result. 



Theorem 2. Aldous' beta-splitting models for $\beta \in (-2, \infty]$ are the only consistent
Markovian binary fragmentations with splitting rule of the form

$$p(i,j) = \frac{w(i)\,w(j)}{Z(i+j)} \quad \text{for all } i,j \ge 1, \tag{7}$$

for some sequence of weights $w(j) \ge 0$, $j \ge 1$, and normalization constants $Z(n)$, $n \ge 2$.



As a corollary, we extract a statement purely about measures on (0, 1). 

Corollary 3. Every symmetric measure $\nu$ on $(0, 1)$ with $\int_{(0,1)} x(1-x)\,\nu(dx) < \infty$, whose
moments factorize into the form

$$\int_{(0,1)} x^i (1-x)^j\, \nu(dx) = w(i)\,w(j) \quad \text{for all } i,j \ge 1$$

for some $w(i) > 0$, $i \ge 1$, is a multiple of one of Aldous' beta-splitting measures (6).

In particular, this characterizes the symmetric beta distributions among probability 
measures on (0, 1). 

Berestycki and Pitman [3] encountered a different one-dimensional class of Gibbs split- 
ting rules in the study of fragmentation processes related to the affine coalescent. These 
are not consistent, but the Gibbs fragmentations are naturally embedded in continuous 
time. 

The rest of this paper is organized as follows. Section 2 offers an alternative
characterization of what we call binary Gibbs models, meaning models with splitting rule
of the form (7), without assuming consistency. Theorem 2 is then proved in Section 3.
In Section 4, we discuss growth procedures and embedding in continuous time for the
consistent case. Section 5 gives a generalization of the Gibbs results to multifurcating
trees.

2. Characterization of binary Gibbs fragmentations

The Gibbs model (7) is overparameterized: if we multiply $w(k)$, $k \ge 1$, by $ab^k$ (and
then $Z(m)$, $m \ge 2$, by $a^2 b^m$), the model remains unchanged. Note, further, that neither
$w(1) = 0$ nor $w(2) = 0$ is possible since then (7) does not define a probability function for
$n = i + j = 3$. Hence, we may assume $w(1) = 1$ and $w(2) = 1$. It is now easy to see that
for any two different such sequences, the models are different. Note that the following
result does not assume a consistent model.



Proposition 4. The following two conditions on a collection of random binary
fragmentations $T_B$ indexed by finite subsets $B$ of $\mathbb{N}$ are equivalent:

(i) $T_B$ is for each $B$ an exchangeable Markovian binary fragmentation with splitting
rule of the Gibbs form (7) for some sequence of weights $w(j) \ge 0$, $j \ge 1$, and
normalization constants $Z(n)$, $n \ge 2$;

(ii) for each $B$, the probability distribution of $T_B$ is of the form

$$P(T_B = t) = \frac{1}{w(\#B)} \prod_{A \in t} \psi(\#A) \quad \text{for all } t \in \mathbb{B}_B, \tag{8}$$

for some sequence of weights $\psi(j) \ge 0$, $j \ge 1$, and normalization constants $w(n)$, $n \ge 1$.
More precisely, if (i) holds with $w(1) = 1$, then (ii) holds for the same sequence $w$ with

$$\psi(1) = 1 \quad \text{and} \quad \psi(k) = w(k)/Z(k), \quad k \ge 2. \tag{9}$$

Conversely, if (ii) holds for some sequence $\psi$ with $\psi(1) = 1$, then (i) holds for the sequence
$w(n)$, $n \ge 1$, determined by (8); in particular, $w(1) = 1$.

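Proof. Given a Gibbs model with $w(1) = 1$, we can combine (3) and (7) to get, for all
$t \in \mathbb{B}_B$,

$$P(T_B = t) = \prod_{A \in t:\, \#A \ge 2} \frac{w(\#A_1)\,w(\#A_2)}{Z(\#A)} = \frac{1}{w(\#B)} \prod_{A \in t:\, \#A \ge 2} \frac{w(\#A)}{Z(\#A)}.$$

If we make the substitution (9), we can read off $w(n)$ as the correct normalization constant
and (8) follows, with $\psi(1) = 1$.

On the other hand, (8) determines the sequence $w(n)$, $n \ge 1$, as

$$w(n) = \sum_{t \in \mathbb{B}_{[n]}} \prod_{A \in t} \psi(\#A).$$

Note, in particular, that $w(1) = \psi(1)$. We can express the normalization constants in the
Gibbs model (7) by the formula

$$\begin{aligned}
Z(m) &= \sum_{k=1}^{m-1} \binom{m-1}{k-1} w(k)\,w(m-k) \qquad (10) \\
&= \sum_{k=1}^{m-1} \binom{m-1}{k-1} \Bigg( \sum_{t_1 \in \mathbb{B}_{[k]}} \prod_{A \in t_1} \psi(\#A) \Bigg) \Bigg( \sum_{t_2 \in \mathbb{B}_{[m-k]}} \prod_{A \in t_2} \psi(\#A) \Bigg) \\
&= \frac{1}{\psi(m)} \sum_{t \in \mathbb{B}_{[m]}} \prod_{A \in t} \psi(\#A) = \frac{w(m)}{\psi(m)},
\end{aligned}$$

as in (9). By application of the previous implication from (i) to (ii), formula (8) gives
the distribution of the Gibbs model derived from this weight sequence $w(n)$ and the
conclusion follows. □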
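
The equivalence of (3) with (7) and the tree formula (8) can be checked by brute force on small label sets. The sketch below is only illustrative: the weights $w(k) = k!$ are an assumed example (any positive weights with $w(1) = 1$ would do), the function names are hypothetical, and exact rational arithmetic is used so that the two formulas can be compared without rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, factorial

def binary_fragmentations(B):
    """Yield each binary fragmentation of the tuple B as a frozenset of frozensets."""
    B = tuple(B)
    root = frozenset(B)
    if len(B) == 1:
        yield frozenset([root])
        return
    first, rest = B[0], B[1:]
    # Fix the block containing the first element so each root split is listed once.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = (first,) + extra
            right = tuple(x for x in rest if x not in extra)
            for t1 in binary_fragmentations(left):
                for t2 in binary_fragmentations(right):
                    yield frozenset([root]) | t1 | t2

def child_sizes(t, A):
    """Sizes of the children of A (maximal proper subsets of A in t)."""
    proper = [C for C in t if C < A]
    return [len(C) for C in proper if not any(C < D for D in proper)]

def w(k):
    return factorial(k)            # assumed example weights with w(1) = 1

def Z(m):
    """Normalization constant of the Gibbs splitting rule, formula (10)."""
    return sum(comb(m - 1, k - 1) * w(k) * w(m - k) for k in range(1, m))

def prob_by_splits(t):
    """Formula (3) with the Gibbs rule (7): product over branch points."""
    out = Fraction(1)
    for A in t:
        if len(A) >= 2:
            i, j = child_sizes(t, A)
            out *= Fraction(w(i) * w(j), Z(len(A)))
    return out

def prob_by_vertices(t, n):
    """Formula (8) with psi(1) = 1 and psi(k) = w(k)/Z(k) as in (9)."""
    out = Fraction(1, w(n))
    for A in t:
        if len(A) >= 2:
            out *= Fraction(w(len(A)), Z(len(A)))
    return out
```

Enumerating all binary fragmentations of a five-element set and comparing `prob_by_splits` with `prob_by_vertices` tree by tree is a direct numerical restatement of the first half of the proof.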

Note that the normalization constant $Z(m)$ in the Gibbs splitting rule (7), given in (10),
is a partial Bell polynomial in $w(1), w(2), \ldots$ (see [15] for more applications of
Bell polynomials), whereas the normalization constant $w(n)$ in the Gibbs tree formula (8)
is a polynomial in $\psi(1), \psi(2), \ldots$ of a much more complicated form. The normalization
constant in (8) is

$$w(n) = \sum_{t \in \mathbb{B}_{[n]}} \prod_{A \in t} \psi(\#A).$$

In an attempt to study this polynomial in $\psi(1), \psi(2), \ldots$, we introduce the signature
$\sigma_t : [n] \to \mathbb{N}$ of a tree $t \in \mathbb{B}_{[n]}$ by

$$\sigma_t(j) = \#\{A \in t : \#A = j\}, \quad j = 1, \ldots, n.$$

Note that $P(T_n = t)$ depends on $t$ only via $\sigma_t$, that is, $\sigma_t$ is a sufficient statistic for
the Gibbs probabilities (8). Denote the set of signatures by $\mathrm{Sig}_n = \{\sigma_t : t \in \mathbb{B}_{[n]}\}$. The
inductive definition of $\mathbb{B}_{[n]}$ yields

$$\mathrm{Sig}_n = \{\sigma^{(1)} + \sigma^{(2)} + 1_n : \sigma^{(1)} \in \mathrm{Sig}_{n_1},\ \sigma^{(2)} \in \mathrm{Sig}_{n_2},\ n_1 + n_2 = n\},$$

where $1_n(j) = 1$ if $j = n$, $1_n(j) = 0$ otherwise. The coefficients $Q_\sigma$ in $w(n)$, when expanded
as a polynomial in $\psi(1), \psi(2), \ldots$, are numbers of fragmentations with the same signature
$\sigma \in \mathrm{Sig}_n$:

$$w(n) = \sum_{\sigma \in \mathrm{Sig}_n} Q_\sigma\, \psi^\sigma, \quad \text{where } \psi^\sigma = \prod_{j=1}^{n} \psi(j)^{\sigma(j)}.$$

Let us associate with each fragmentation $t \in \mathbb{B}_{[n]}$ its tree shape (combinatorial tree
without labels) $t^\circ$ and denote by $\mathbb{B}^\circ_n$ the collection of shapes of binary trees with $n$
leaves. Clearly, two fragmentations with the same tree shape have the same signature,
so we can define $\sigma(t^\circ)$ in the obvious way. For $n \le 8$ (and many larger trees), direct
enumeration shows that the tree shape $t^\circ \in \mathbb{B}^\circ_n$ is uniquely determined by its signature
$\sigma$, and $Q_\sigma$ is just the number $q(t^\circ)$ of different labelings. For $n \ge 9$, this is false: there
are two tree shapes with signature $(9,3,1,2,1,0,0,0,1)$; see Figure 2. If we denote by
$I^\circ_\sigma \subseteq \mathbb{B}^\circ_n$ the set of tree shapes with signature $\sigma$, then $Q_\sigma = \sum_{t^\circ \in I^\circ_\sigma} q(t^\circ)$. The remaining
combinatorial problem is therefore to study $I^\circ_\sigma$ and $q(t^\circ)$. We have not been able to solve
this problem. The preprint version [12] of the present paper includes an Appendix with
a partial study; see also Corollary 2.4.3 of [17].
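
The direct enumeration mentioned above is easy to reproduce. The following sketch, with hypothetical function names, represents a binary tree shape as a nested sorted tuple (a leaf is the empty tuple), computes signatures as tuples $(\sigma(1), \ldots, \sigma(n))$, and groups shapes by signature.

```python
from collections import defaultdict
from functools import lru_cache

@lru_cache(maxsize=None)
def shapes(n):
    """All binary tree shapes with n leaves; a leaf is (), a branch point a sorted pair."""
    if n == 1:
        return ((),)
    out = set()
    for a in range(1, n // 2 + 1):
        for s1 in shapes(a):
            for s2 in shapes(n - a):
                out.add(tuple(sorted((s1, s2))))   # sorting gives a canonical form
    return tuple(sorted(out))

def size(s):
    """Number of leaves of the shape s."""
    return 1 if s == () else size(s[0]) + size(s[1])

def signature(s):
    """Tuple whose j-th entry counts the vertices of s with exactly j leaves."""
    sig = [0] * size(s)
    stack = [s]
    while stack:
        v = stack.pop()
        sig[size(v) - 1] += 1
        stack.extend(v)                            # a leaf () adds nothing
    return tuple(sig)

def collisions(n):
    """Signatures shared by more than one shape with n leaves."""
    by_sig = defaultdict(list)
    for s in shapes(n):
        by_sig[signature(s)].append(s)
    return {sig: group for sig, group in by_sig.items() if len(group) > 1}
```

Calling `collisions(n)` for $n = 1, \ldots, 8$ and then for $n = 9$ reproduces the statement in the text: no shared signatures up to eight leaves, and the pair of shapes in Figure 2 at nine leaves.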



3. Consistent binary Gibbs rules 

The statement of Theorem 2 specifies Aldous' [2] beta-splitting models by their integral
representation (5). Observe that the moment formula for beta distributions easily gives

$$p(i,j) = \frac{\Gamma(i+\beta+1)\,\Gamma(j+\beta+1)}{R(i+j)} \quad \text{for all } i,j \ge 1, \tag{11}$$

for normalization constants $R(n) = Z(n)\,\Gamma(n + 2\beta + 2)$, $n \ge 2$. This is for $\beta \in (-2, \infty)$.
For $\beta = \infty$, we simply get $p(i,j) = 1/R(i+j)$ for all $i,j \ge 1$, where $R(n) = Z(n)2^n$, $n \ge 2$.
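
A quick numerical sanity check of (11) against the normalization condition and the consistency criterion (4) can be sketched as follows. The helper names are hypothetical, and the normalization is computed here by summing over root splits (the form used in (10)) rather than through $R(n)$, which is an implementation choice of this sketch.

```python
from math import comb, exp, inf, lgamma

def make_rule(beta):
    """Beta-splitting rule p(i, j) for beta in (-2, inf], normalized numerically."""
    def w(i):
        # Gibbs weight Gamma(i + beta + 1); constant weights in the case beta = inf
        return 1.0 if beta == inf else exp(lgamma(i + beta + 1))
    def norm(n):
        return sum(comb(n - 1, k - 1) * w(k) * w(n - k) for k in range(1, n))
    def p(i, j):
        return w(i) * w(j) / norm(i + j)
    return p

def normalization_defect(p, n):
    """Deviation of the total splitting probability of an n-set from 1."""
    return abs(sum(comb(n - 1, k - 1) * p(k, n - k) for k in range(1, n)) - 1.0)

def consistency_defect(p, i, j):
    """Absolute difference between the two sides of criterion (4)."""
    rhs = p(i + 1, j) + p(i, j + 1) + p(i + j, 1) * p(i, j)
    return abs(p(i, j) - rhs)
```

For instance, `consistency_defect(make_rule(0.0), 2, 3)` should be zero up to floating-point error, in line with Theorem 2.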



Proof of Theorem 2. We start from a general Gibbs model (7) with $w(1) = 1$ and
follow [7], Section 2 closely, where a similar characterization is derived in a partition
rather than a tree context. Let the Gibbs model be consistent. This immediately implies
that $w(j) > 0$ for all $j \ge 1$. The consistency criterion (4) in terms of $W_j = w(j+1)/w(j)$
now gives

$$W_i + W_j = \frac{Z(i+j+1) - w(i+j)}{Z(i+j)} \quad \text{for all } i,j \ge 1. \tag{12}$$

The right-hand side is a function of $i + j$, so $W_{j+1} - W_j$ is constant and hence $W_j = a + bj$
for some $b \ge 0$ and $a > -b$. Now, either $b = 0$ (excluded for the time being) or

$$w(j) = W_1 \cdots W_{j-1} = \prod_{q=1}^{j-1} (a + bq) = b^{j-1}\, \frac{\Gamma(a/b + j)}{\Gamma(a/b + 1)}$$

and, hence, reparameterizing by $\beta = a/b - 1 \in (-2, \infty)$ and pushing $b^{i+j-2}$ into the
normalization constant $d_{i+j} = b^{i+j-2}/Z(i+j)$, we have

$$p(i,j) = \frac{w(i)\,w(j)}{Z(i+j)} = d_{i+j}\, \frac{\Gamma(i+1+\beta)}{\Gamma(2+\beta)}\, \frac{\Gamma(j+1+\beta)}{\Gamma(2+\beta)}.$$

The case $b = 0$ is the limiting case $\beta = \infty$, when, clearly, $w \equiv 1$ (now pushing $a^{i+j-2}$
into the normalization constant).

These are precisely Aldous' beta-splitting models, as in (11). □

Figure 2. Two tree shapes with the same signature (here marked by subtree sizes).

While we identified the boundary case $\beta = \infty$ as being of Gibbs type, the boundary
case $\beta = -2$ is not of Gibbs type, although it can still be made precise as a Markovian
fragmentation model with characteristics $c > 0$ and $\nu = 0$ (pure erosion): $p(i,j) = 0$ unless
$i = 1$ or $j = 1$, so the Markovian fragmentations $T_n$ are combs, where all $n - 1$ branching
vertices are lined up in a single spine.

In the proof of the theorem, we obtained as parameterization for the Gibbs models
(7),

$$w(j) = \frac{\Gamma(j+1+\beta)}{\Gamma(2+\beta)}, \tag{13}$$

for some $\beta \in (-2, \infty)$, or $w(j) \equiv 1$ for $\beta = \infty$. Note that the simple convention $w(2) = 1$
from Section 2 is not useful here. We can now still deduce the parameterization (8) by
Proposition 4, in principle. However, since $\psi(k) = w(k)/Z(k)$ involves partial Bell
polynomials $Z(k)$ in $w(1), w(2), \ldots$, this is less explicit in terms of $\beta$ than the parameterization
(7):

$$\psi(2) = 2 + \beta, \quad \psi(3) = \frac{3+\beta}{3}, \quad \psi(4) = \frac{(3+\beta)(4+\beta)}{18 + 7\beta}.$$

Special cases that have been studied in various biology and computer science contexts
(see Aldous [2] for a review) include the following: $\beta = -3/2, -1, 0, \infty$. In these cases,
we can explicitly calculate the Gibbs parameters in (7) and (8) and the normalisation
constants.

If $\beta = -3/2$, we can take $\psi(n) = 1$ and $T_B$ is uniformly distributed: if $\#B = n$, then
$P(T_B = t) = 2^{n-1}(n-1)!/(2n-2)!$, $t \in \mathbb{B}_B$. The asymptotics of uniform trees lead to
Aldous' Brownian CRT [1]; see also [15], Section 6.3. Table 1 uses a different
parameterization via the convenient relations (9) and (13).

The case $\beta = -1$ is the limiting conditional distribution in the Ewens family as the
Ewens parameter $\lambda \to 0$, conditional on the occurrence of a split. The $\beta = 0$ case is
known as the Yule model and $\beta = \infty$ as the symmetric binary trie (see Aldous [2]).
Continuum tree limits of the beta-splitting model for $\beta \in (-2, -1)$ are described in [9].
The normalization that leads to a compact limit tree is here $T_{[n]}/n^{-\beta-1}$, where $T_{[n]}$ is
represented as a metric tree with unit edge lengths and the scaling $T_{[n]}/n^{-\beta-1}$ refers
to scaling of edge lengths. Aldous [2] studies weaker asymptotic properties for average
distance from a leaf to the root, also for $\beta \ge -1$, where growth is logarithmic.



4. Growth rules and embedding in continuous time 

In [9], we study the consistently growing sequence $T_n$, $n \ge 1$, where $T_n := T_{[n]} = T_{[n],[n+1]}$
is the restriction of $T_{n+1}$ to $[n]$ for all $n \ge 1$, in a general context of consistent
Markovian multifurcating fragmentation models. The integral representation (5) stems from an
association with Bertoin's theory of homogeneous fragmentation processes in continuous
time [4]. Let us here look at the binary case in general and Gibbs fragmentations in
particular.

Consider the distribution of $T_{n+1}$, given $T_n$. The tree $T_{n+1}$ has a vertex $A \cup \{n+1\}$
with children $\{n+1\}$ and $A \in T_n$. We say that $n+1$ has been attached below $A$. In
passing from $T_n$ to $T_{n+1}$, leaf $n+1$ can be attached below any vertex $A$ of $T_n$ (including
$[n]$ and all leaf nodes). Note that to construct $T_{n+1}$ from $T_n$, $n+1$ is also added as an
element to all vertices on the path from $[n]$ to $A$. Vertex $A \in T_n$ is special in that both
$A$ and $A \cup \{n+1\}$ are in $T_{n+1}$.

Fix a vertex $A$ of $t \in \mathbb{B}_{[n]}$ and consider the conditional probability, given $T_n = t$, of
$n+1$ being attached below $A$. This is the ratio of two probabilities of the form (3) in
which many common factors cancel so that only the probabilities along the path from
$[n]$ to $A$ remain. This yields the following result.

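Proposition 5. Let $t \in \mathbb{B}_{[n]}$ and $A \in t$. Denote by

$$[n] = A_1 \supset \cdots \supset A_h = A$$

the path from $[n]$ to $A$. We refer to $h \ge 1$ as the height of $A$ in $t$. The probability that
$n+1$ attaches below $A$ is then

$$\left( \prod_{j=1}^{h-1} \frac{p(\#A_{j+1} + 1,\ \#(A_j \setminus A_{j+1}))}{p(\#A_{j+1},\ \#(A_j \setminus A_{j+1}))} \right) p(\#A_h, 1).$$

For the uniform model (Gibbs fragmentation with $\beta = -3/2$), this product is telescoping,
or we calculate directly from (8)

$$\left( \prod_{j=1}^{h-1} \frac{p(\#A_{j+1} + 1,\ \#(A_j \setminus A_{j+1}))}{p(\#A_{j+1},\ \#(A_j \setminus A_{j+1}))} \right) p(\#A_h, 1) = \frac{1}{2n-1},$$

giving a simple sequential construction (see, e.g., [15], Exercise 7.4.11).

Table 1. Closed form expressions of the parameters for $\beta = -3/2, -1, 0, \infty$

$\beta$      $-3/2$                                  $-1$        $0$     $\infty$
$w(n)$     $\frac{(2n-2)!}{2^{2n-2}(n-1)!}$        $(n-1)!$    $n!$    $1$
(second row illegible in the source)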
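
The attachment probabilities of Proposition 5 can be computed in one pass down the tree. In the sketch below a binary tree is a nested pair of integer leaf labels, and the helper names `make_rule`, `size` and `attach_probs` are hypothetical; the beta-splitting rule is used only as an example of a consistent splitting rule.

```python
from math import comb, exp, inf, lgamma

def make_rule(beta):
    """Beta-splitting rule p(i, j), normalized over the root splits of an (i+j)-set."""
    def w(i):
        return 1.0 if beta == inf else exp(lgamma(i + beta + 1))
    def norm(n):
        return sum(comb(n - 1, k - 1) * w(k) * w(n - k) for k in range(1, n))
    return lambda i, j: w(i) * w(j) / norm(i + j)

def size(t):
    """Number of leaves; a leaf is an int, a branch point a pair (left, right)."""
    return 1 if isinstance(t, int) else size(t[0]) + size(t[1])

def attach_probs(t, p):
    """List of (vertex, probability that the new leaf attaches below that vertex)."""
    out = []
    def walk(v, factor):
        # factor is the product of ratios along the path from the root to v
        out.append((v, factor * p(size(v), 1)))
        if not isinstance(v, int):
            a, b = v
            na, nb = size(a), size(b)
            walk(a, factor * p(na + 1, nb) / p(na, nb))
            walk(b, factor * p(nb + 1, na) / p(nb, na))
    walk(t, 1.0)
    return out
```

For a tree with $n$ leaves there are $2n - 1$ vertices, and for a consistent rule the returned probabilities sum to one; with `make_rule(-1.5)` each of them equals $1/(2n-1)$ up to rounding.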

It was shown in [9] that consistent Markovian fragmentation models can be assigned
consistent independent exponential edge lengths, where the edge below vertex $A$ is given
parameter $\lambda_{\#A}$, for a family $(\lambda_m)_{m \ge 1}$ of rates, where $\lambda_1 = 0$, $\lambda_2$ is arbitrary and $\lambda_m$,
$m \ge 3$, is determined by $\lambda_2$ and the splitting rule $p$, in that consistency requires

$$\lambda_{n+1}(1 - p(n,1)) = \lambda_n \quad \text{for all } n \ge 2. \tag{14}$$

The interpretation is that the partition of $[n+1]$ in $T_{n+1}$ (arriving at rate $\lambda_{n+1}$) splits $[n]$
only with probability $1 - p(n,1)$ and this thinning must reduce the rate for the partition
of $[n]$ in $T_n$ to $\lambda_n$. This rate $\lambda_n$ also applies in $T_{n+1}$ after a first split $\{[n], \{n+1\}\}$.
Using consistency, equation (14) also implies

$$\lambda_n\, p(i,j) = \lambda_{n+1}\big(p(i,j+1) + p(i+1,j)\big) \quad \text{for all } i,j \ge 1 \text{ with } i + j = n.$$

For the Gibbs fragmentation models, we obtain, using (14), (7), (12) and (13),

$$\begin{aligned}
\lambda_n &= \lambda_2 \prod_{j=2}^{n-1} \frac{1}{1 - p(j,1)} = \lambda_2 \prod_{j=2}^{n-1} \frac{Z(j+1)}{Z(j+1) - w(j)} = \lambda_2 \prod_{j=2}^{n-1} \frac{Z(j+1)}{Z(j)}\, \frac{1}{W_1 + W_{j-1}} \\
&= \lambda_2\, Z(n) \prod_{j=2}^{n-1} \frac{w(j-1)}{w(2)\,w(j-1) + w(j)} = \lambda_2\, Z(n)\, \frac{\Gamma(4+2\beta)}{\Gamma(n+2+2\beta)},
\end{aligned}$$

where we require $\beta < \infty$ for the last step. Table 2 contains the rate sequences for $\beta =
-3/2, -1, 0, \infty$ in the case $\lambda_2 = 1$.
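
The recursion (14) gives a direct way to tabulate the rates numerically. The sketch below uses hypothetical helper names and takes $\lambda_2 = 1$ by default; the closed forms it can be compared with are those of Table 2, and the value $\sum_{j=1}^{n-1} 1/j$ for $\beta = -1$ follows from (14) by a direct computation.

```python
from math import comb, exp, inf, lgamma

def make_rule(beta):
    """Beta-splitting rule p(i, j), normalized over the root splits of an (i+j)-set."""
    def w(i):
        return 1.0 if beta == inf else exp(lgamma(i + beta + 1))
    def norm(n):
        return sum(comb(n - 1, k - 1) * w(k) * w(n - k) for k in range(1, n))
    return lambda i, j: w(i) * w(j) / norm(i + j)

def rates(beta, n_max, lam2=1.0):
    """Rates lambda_1, ..., lambda_{n_max} obtained by iterating (14)."""
    p = make_rule(beta)
    lam = {1: 0.0, 2: lam2}
    for n in range(2, n_max):
        # (14): lambda_{n+1} * (1 - p(n, 1)) = lambda_n
        lam[n + 1] = lam[n] / (1.0 - p(n, 1))
    return lam
```

In the binary case every choice of $\beta$ gives `rates(beta, 3)[3] == 1.5` up to rounding, since $p(2,1) = 1/3$ is forced by exchangeability.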

Not only is $(\lambda_n)_{n \ge 3}$ determined by $p$, but a converse of this also holds.

Proposition 6. Let $(\lambda_n)_{n \ge 2}$ be a consistent rate sequence associated with a consistent
Markovian binary fragmentation model with splitting rule $p$, meaning that (14) holds.
Then, $p$ is uniquely determined by $(\lambda_n)_{n \ge 2}$.






P. McCullagh, J. Pitman and M. Winkel 



Proof. It is evident from (14) that $p(n, 1)$ is determined for all $n \ge 2$, and $p(1, 1) = 1$.
Now, (4) for $i = 1$ determines $p(2, j)$ for all $j \ge 2$, and an induction in $i$ completes the
proof. □

A more subtle question is to ask what sequences $(\lambda_n)_{n \ge 2}$ arise as consistent rate
sequences. The above argument can be made more explicit to yield

$$p(k, n-k) = \frac{1}{\lambda_n} \sum_{i=0}^{k} (-1)^{i+1} \binom{k}{i} \lambda_{n-k+i}, \quad 1 \le k \le n/2,$$

which means that $(\lambda_n)_{n \ge 2}$ must have a discrete complete monotonicity, in that $k$th
differences of $(\lambda_n)_{n \ge 2}$ must be of alternating signs, $k \ge 1$. This condition is not sufficient,
however, as simple examples for $n = 3$ show ($\lambda_n = (n-1)^\alpha$ is completely monotone for
$\alpha \in (0, 1)$, but exchangeability implies that $1/3 = p(1, 2) = (\lambda_3 - \lambda_2)/\lambda_3$ and so $\lambda_3 = 3/2$,
whereas $(3-1)^\alpha \in (1, 2)$ - even in the multifurcating case, cf. Section 5, we always have
$\lambda_3 \le 3/2$).
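
To see the inversion at work, the sketch below recovers a splitting rule from its rate sequence by the alternating sum $p(k, n-k) = \lambda_n^{-1} \sum_{i=0}^{k} (-1)^{i+1} \binom{k}{i} \lambda_{n-k+i}$ with $\lambda_1 = 0$, which is the form obtained by iterating $\lambda_n p(i,j) = \lambda_{n+1}(p(i,j+1) + p(i+1,j))$. The function names are hypothetical, and the Yule case $\beta = 0$ with $\lambda_n = 3(n-1)/(n+1)$ serves as the assumed example.

```python
from fractions import Fraction
from math import comb

def p_from_rates(lam, k, n):
    """Splitting probability p(k, n - k) recovered from a rate function lam."""
    total = sum((-1) ** (i + 1) * comb(k, i) * lam(n - k + i) for i in range(k + 1))
    return total / lam(n)

def yule_rate(n):
    """Rate sequence for beta = 0 with lambda_2 = 1; note that yule_rate(1) == 0."""
    return Fraction(3 * (n - 1), n + 1)

def yule_rule(k, n):
    """p(k, n - k) for beta = 0: weights w(k) = k! give 2 / ((n - 1) * C(n, k))."""
    return Fraction(2, (n - 1) * comb(n, k))
```

For example, `p_from_rates(yule_rate, 2, 4)` returns `Fraction(1, 9)`, the probability that the Yule model splits a four-element set into a given pair of two-element blocks.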

Proposition 7. A sequence $(\lambda_n)_{n \ge 2}$ arises as rate sequence of a consistent Markovian
binary fragmentation model if and only if

$$\lambda_n = nc + \int_{(0,1)} \big(1 - x^n - (1-x)^n\big)\, \nu(dx)$$

for some $c \ge 0$ and $\nu$ a symmetric measure on $(0, 1)$ with $\int_{(0,1)} x(1-x)\,\nu(dx) < \infty$. The
characteristics of the splitting rules associated with $(\lambda_n)_{n \ge 2}$ are $(c, \nu)$.

Proof. This is a consequence of the integral representation (5) and [9], Proposition 3.
Specifically, the association with Bertoin's theory of homogeneous fragmentations yields
that each of $1, \ldots, n$ suffer erosion (being turned into a singleton) at rate $c$; the measure
$\nu(dx)$ gives the rate of fragmentations into two parts, to which $1, \ldots, n$ are allocated
independently with probabilities $(x, 1-x)$, hence splitting $[n]$ with probability
$1 - x^n - (1-x)^n$. □
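
As a worked instance of Proposition 7, consider the Yule case $\beta = 0$, that is, $c = 0$ and $\nu(dx) = dx$ in (6). The computation below is a short check under these assumptions; the factor 3 is the rescaling of $\nu$ needed to obtain $\lambda_2 = 1$.

```latex
\int_0^1 \bigl(1 - x^n - (1-x)^n\bigr)\,dx
  = 1 - \frac{1}{n+1} - \frac{1}{n+1}
  = \frac{n-1}{n+1},
\qquad\text{so}\qquad
\lambda_n = 3\,\frac{n-1}{n+1} = \frac{3n-3}{n+1}
\quad\text{when } \lambda_2 = 1 .
```

This agrees with the entry for $\beta = 0$ in Table 2.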

The complete monotonicity is related to the study of the block containing 1, a tagged
fragment; see [4, 10]. Since $\lambda_n$ is the rate at which one or more of $\{2, \ldots, n\}$ leave the
block containing 1, the rate is composed of three components - a rate $c$ for the erosion
of 1, a rate $(n-1)c$ for the erosion of $2, \ldots, n$ and a rate $\Lambda(dz)$ of fragmentations into
two parts, to which $2, \ldots, n$ are allocated independently with probabilities $(e^{-z}, 1 - e^{-z})$,
with 1 in the former part, hence splitting $[n]$ with probability $1 - e^{-(n-1)z}$. Therefore

$$\lambda_n = c + (n-1)c + \int_{(0,\infty)} \big(1 - e^{-(n-1)z}\big)\, \Lambda(dz) = cn + \int_{(0,1)} \frac{1 - \xi^{n-1}}{1 - \xi}\, \mu(d\xi) = \Phi(n-1)$$

for a Bernstein function $\Phi$, a finite measure $\mu$ on $(0, 1)$ or a Lévy measure $\Lambda$ on $(0, \infty)$
with $\int_{(0,\infty)} (1 \wedge x)\, \Lambda(dx) < \infty$ (see [4, 8, 10]), that is, $\lambda_n$ can be extended to a completely
monotone function of a real parameter.

Table 2. Explicit rate sequences for $\beta = -3/2, -1, 0, \infty$

$\beta$         $-3/2$                                        $-1$                              $0$                     $\infty$
$\lambda_n$     $\frac{n-1}{2^{2n-3}}\binom{2n-2}{n-1}$       $\sum_{j=1}^{n-1} \frac{1}{j}$    $\frac{3n-3}{n+1}$      $2(1 - 2^{-n+1})$



5. Multifurcating Gibbs fragmentations and Poisson-Dirichlet models

As a generalization of the binary framework of the previous sections, we consider in this
section consistent Markovian fragmentation models with splitting rule $p$ as in (2) of the
Gibbs form

$$p(n_1, \ldots, n_k) = \frac{a(k)}{c(n)} \prod_{j=1}^{k} w(n_j), \qquad n = n_1 + \cdots + n_k, \tag{15}$$

for some $w(j) \ge 0$, $j \ge 1$, $a(k) \ge 0$, $k \ge 2$, and normalization constants $c(n) > 0$, $n \ge 2$.
Note that we must have $w(1) > 0$ and $a(2) > 0$ to get positive probabilities for $n = 2$. To
remove overparameterization, we will assume $w(1) = 1$ and $a(2) = 1$. Also, if we multiply
$w(j)$ by $b^{j-1}$ and $a(k)$ by $b^k$ (and $c(n)$ by $b^n$), the model remains unchanged. We will
use this observation to get a nice parameterization in the consistent case (Theorem 8
below).

In [9], we showed that consistency of the model is equivalent to the set of equations

$$\begin{aligned}
p(n_1, \ldots, n_k) ={}& p(n_1 + 1, n_2, \ldots, n_k) + \cdots + p(n_1, \ldots, n_k + 1) + p(n_1, \ldots, n_k, 1) \\
&+ p(n_1 + \cdots + n_k, 1)\, p(n_1, \ldots, n_k)
\end{aligned} \tag{16}$$

for all $n_1, \ldots, n_k \ge 1$, $k \ge 2$. We also established an integral representation extending (5)
to the multifurcating case. The special case relevant for us is in terms of a measure $\nu$ on
$S^\downarrow = \{s = (s_i)_{i \ge 1} : s_1 \ge s_2 \ge \cdots \ge 0,\ s_1 + s_2 + \cdots = 1\}$ satisfying $\int_{S^\downarrow} (1 - s_1)\, \nu(ds) < \infty$:

$$p(n_1, \ldots, n_k) = \frac{1}{Z(n)} \int_{S^\downarrow} \sum_{i_1, \ldots, i_k \text{ distinct}}\ \prod_{j=1}^{k} s_{i_j}^{n_j}\, \nu(ds). \tag{17}$$

The general case has a further parameter $c \ge 0$, as in (5), and also allows $\nu$ to charge
$(s_i)_{i \ge 1}$ with $s_1 + s_2 + \cdots < 1$; see [9]. We will only meet the extreme case $p(1, \ldots, 1) = 1$,
which corresponds to $\nu = \delta_{(0,0,\ldots)}$.

We set

$$A_k = \frac{a(k+1)}{a(k)}, \quad C_n = \frac{c(n+1)}{c(n)}, \quad W_n = \frac{w(n+1)}{w(n)}$$

and, in analogy to Proposition 5, we find that, given $T_n = t \in \mathbb{T}_{[n]}$, for each vertex $B \in t$,
the probability that $n+1$ attaches below $B$ is

$$\left( \prod_{j=1}^{h-1} \frac{W_{n_{j+1}}}{C_{n_j}} \right) \frac{a(2)\,w(n_h)\,w(1)}{c(n_h + 1)},$$

where $[n] = S_1 \supset \cdots \supset S_h = B$ is the path from $[n]$ to $B$, $n_j = \#S_j$ and $k_j$ denotes the
number of children of $S_j$, $j = 1, \ldots, h$.

However, $n+1$ can also attach as a singleton block to an existing partition $\{B_1, \ldots, B_k\}$
of $B \in T_n$. In this case, we say that $n+1$ attaches to the vertex $B$. For each non-leaf
vertex $B \in t$, the probability that $n+1$ attaches to the vertex $B$ is

$$\left( \prod_{j=1}^{h-1} \frac{W_{n_{j+1}}}{C_{n_j}} \right) \frac{A_{k_h}\, w(1)}{C_{n_h}}.$$

In this framework, we have the following generalization of Theorem 2 to the
multifurcating case.

Theorem 8. If $p$ is of the Gibbs form (15) and consistent, then $p$ is associated with the
two-parameter Ewens-Pitman family given by

$$w(n) = \frac{\Gamma(n - \alpha)}{\Gamma(1 - \alpha)} \quad \text{and} \quad a(k) = \alpha^{k-2}\, \frac{\Gamma(k + \theta/\alpha)}{\Gamma(2 + \theta/\alpha)}$$

(or limiting quantities as $\alpha \downarrow 0$), $c(n)$, $n \ge 2$, being normalization constants, for a parameter
range extended as follows:

• either $0 \le \alpha < 1$ and $\theta > -2\alpha$ (multifurcating cases with arbitrarily high block
numbers),

• or $\alpha < 0$ and $\theta = -m\alpha$ for some integer $m \ge 3$ (multifurcating with at most $m$
blocks),

• or $\alpha < 1$ and $\theta = -2\alpha$ (binary case),

• or $\alpha = -\infty$ and $\theta = m$ for some integer $m \ge 2$, that is, $a(2) = 1$, $a(k) = (m-2) \cdots (m-k+1)$,
$k \ge 3$, and $w(j) \equiv 1$ (recursive coupon collector, where a split of
$[n]$ is obtained by letting each element of $[n]$ pick one of $m$ coupons at random, just
conditioned so that at least two different coupons are picked),

• or $\alpha = 1$, that is, $w(1) = 1$, $w(j) = 0$, $j \ge 2$ (deterministic split into singleton blocks).
In terms of the integral representation (17), the measure $\nu$ on $S^\downarrow$ is, respectively,
size-ordered Poisson-Dirichlet$(\alpha, \theta)$, Dirichlet$(-\alpha, \ldots, -\alpha)$, Beta$(-\alpha, -\alpha)$, $\delta_{(1/m, \ldots, 1/m)}$ and
$\delta_{(0,0,\ldots)}$.

Proof. For the Gibbs fragmentation model with $w(1) = a(2) = 1$ and $w(j) > 0$ for all
$j \ge 2$ with notation as introduced, consistency (16) is easily seen to be equivalent to

$$C_n = W_{n_1} + \cdots + W_{n_k} + A_k + \frac{w(n)}{c(n)} \quad \text{for all } n_1 + \cdots + n_k = n, \tag{18}$$

where $k \le m$ if $m = \inf\{i \ge 1 : a(i+1) = 0\} < \infty$.

As in the proof of Theorem 2, we deduce from this (the special case $k = 2$) that either
$W_j = a > 0$ (excluded for the time being as $b = 0$) or

$$W_j = a + bj \ \Rightarrow\ w(j) = W_1 \cdots W_{j-1} = b^{j-1}\, \frac{\Gamma(j - \alpha)}{\Gamma(1 - \alpha)} \quad \text{for all } j \ge 1,$$

for some $b > 0$, $a > -b$ and $\alpha := -a/b < 1$. As noted above, we can reparameterize so
that we get $b = 1$ without loss of generality. In particular, $W_j = j - \alpha$, $j \ge 1$, and so (18)
reduces to

$$C_n = n - k\alpha + A_k + \frac{w(n)}{c(n)} \quad \text{for all } 2 \le k \le m \wedge n.$$

Similarly, we deduce that $\theta := A_k - k\alpha$ does not depend on $k$ and so $a(k) = \theta^{k-2}$ if $\alpha = 0$,
and otherwise,

$$A_k = \theta + k\alpha \ \Rightarrow\ a(k) = A_2 \cdots A_{k-1} = \alpha^{k-2}\, \frac{\Gamma(k + \theta/\alpha)}{\Gamma(2 + \theta/\alpha)} \quad \text{for all } 2 \le k \le m + 1.$$

Note that this algebraic derivation leads to probabilities in (15) only in the following
cases.

• If $0 \le \alpha < 1$, then $a(3) = A_2 = \theta + 2\alpha > 0$ if and only if $\theta > -2\alpha$, and then also
$A_k = \theta + k\alpha > 0$ and $a(k) > 0$ for all $k \ge 3$.

• If $\alpha < 0$, then $a(3) = A_2 = \theta + 2\alpha > 0$ if and only if $\theta > -2\alpha$ also, but then $A_k = \theta + k\alpha$
is strictly decreasing in $k$ and $A_k < 0$ eventually, which impedes $m = \infty$. If we have
$m < \infty$, we achieve $a(m+1) = 0$ if and only if $\theta = -m\alpha$. The iteration only takes
us to $a(m+1) = 0$ and we specify $a(k) = 0$ for $k > m$ also. We cannot specify $a(k)$,
$k > m + 1$, differently, since every consistent Gibbs fragmentation with $a(k) > 0$
for $k > m + 1$ has the property that, with positive probability, $T_{[k]} = \{[k], \{1\}, \ldots, \{k\}\}$
has only one branch point $[k]$ of multiplicity $k$, but then the restricted tree satisfies
$T_{[m+1],[k]} = \{[m+1], \{1\}, \ldots, \{m+1\}\}$ with positive probability, which contradicts
$a(m+1) = 0$.

• If $a(3) = 0$, that is, $m = 2$, the argument of the preceding bullet point shows that we
are in the binary case $a(k) = 0$ for all $k \ge 3$ and we can conclude by Theorem 2.

• The case $b = 0$ is the limiting case $\alpha = -\infty$ with $w(j) \equiv 1$. We take up the argument
to see that $A_k = \theta - k$ and so $m < \infty$ and $\theta = m$, where we then get $a(2) = 1$ and
$a(k) = (m-2) \cdots (m-k+1)$, $3 \le k \le m + 1$.

Finally, if $w(m) = 0$ for some $m \ge 2$, then consistency imposes $w(j) = 0$ for all $j \ge m$,
and it follows from the integral representation (17) that in fact $w(j) = 0$ for all $j \ge 2$.
The identification of $\nu$ on the standard parameter range can be read from [15], Section
3.2. For the extension $-\alpha \ge \theta > -2\alpha$, we refer to [10]. □
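
The consistency equations (16) can be verified exactly for the weights of Theorem 8 on small set sizes. The sketch below is illustrative only: the function names are hypothetical, the weights are written as the finite products $w(n) = \prod_{q=1}^{n-1}(q - \alpha)$ and $a(k) = \prod_{i=2}^{k-1}(\theta + i\alpha)$ (which equal the Gamma-function expressions and also cover $\alpha = 0$), and $c(n)$ is computed by summing over all partitions of $[n]$ with at least two blocks.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def int_partitions(n, max_part=None):
    """Yield the partitions of the integer n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in int_partitions(n - first, first):
            yield (first,) + rest

def n_set_partitions(sizes):
    """Number of partitions of a set whose block sizes form the multiset `sizes`."""
    den = 1
    for s in sizes:
        den *= factorial(s)
    for mult in Counter(sizes).values():
        den *= factorial(mult)
    return factorial(sum(sizes)) // den

def make_rule(alpha, theta):
    """Gibbs splitting rule (15) with w(1) = a(2) = 1; alpha, theta are Fractions."""
    def weight(sizes):
        out = Fraction(1)
        for i in range(2, len(sizes)):        # a(k) = prod_{i=2}^{k-1} (theta + i*alpha)
            out *= theta + i * alpha
        for s in sizes:
            for q in range(1, s):             # w(s) = prod_{q=1}^{s-1} (q - alpha)
                out *= q - alpha
        return out
    def c(n):
        return sum(n_set_partitions(s) * weight(s)
                   for s in int_partitions(n) if len(s) >= 2)
    return lambda *sizes: weight(sizes) / c(sum(sizes))

def consistent(p, sizes):
    """Exact check of equation (16) for the composition `sizes`."""
    n, k = sum(sizes), len(sizes)
    rhs = sum(p(*(sizes[:i] + (sizes[i] + 1,) + sizes[i + 1:])) for i in range(k))
    rhs += p(*sizes, 1) + p(n, 1) * p(*sizes)
    return p(*sizes) == rhs
```

With `make_rule(Fraction(1, 2), Fraction(-1, 2))`, a parameter pair in the extended range $-\alpha \ge \theta > -2\alpha$, `consistent` returns `True` for every composition tried; with `make_rule(Fraction(-1), Fraction(2))` the rule is binary and coincides with the Yule model.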

Kerov [11] showed that the only exchangeable partitions of $\mathbb{N}$ of Gibbs type are of
the two-parameter family PD$(\alpha, \theta)$ with usual range for parameters $\theta > -\alpha$, etc.; see
also [7, 14]. Theorem 8 is a generalization to splitting rules that allows an extended
parameter range for the same reason as in the binary case: the trivial partition of one
single block is excluded from $p$ and when associating consistent exponential edge lengths
with parameters $\lambda_m$, $m \ge 1$, the first split of $[m+1]$ happens at a higher and higher rate
and we may have $\lambda_m \to \infty$. In fact,

$$\kappa(\{\pi \in \mathcal{P}_{\mathbb{N}} : \pi|_{[n]} = \{B_1, \ldots, B_k\}\}) = \lambda_n\, p(\#B_1, \ldots, \#B_k)$$

uniquely defines a $\sigma$-finite measure on $\mathcal{P}_{\mathbb{N}} \setminus \{\mathbb{N}\}$, the set of non-trivial partitions of $\mathbb{N}$,
associated with a homogeneous fragmentation process. This is closely related to (17)
via Kingman's paintbox representation $\kappa = \int_{S^\downarrow} \kappa_s\, \nu(ds)$. The extended range was first
observed by Miermont [13] in the special case $\theta = -1$ (related to the stable trees of
Duquesne and Le Gall [5]).

We refer to [10] for a study of spinal partitions of Markovian fragmentation models. 
There are notions of fine and coarse spinal partitions. First, remove from $T_n$ the spine
of 1, that is, the path from $[n]$ to $\{1\}$. The resulting collection is a disjoint union of
fragmentations of sets $B_j$, say, that form a partition of $\{2, \ldots, n\}$, which is called the fine
spinal partition. Second, merge blocks (in the multifurcating case) that were children of
the same spinal vertex; the resulting partition is called the coarse spinal partition. It is
shown that for the splitting rules from the two-parameter family with parameters $\alpha$ and
$\theta$ (the Gibbs fragmentations), the fine partition is obtained from the coarse partition by
applying independently for each block of the coarse partition an exchangeable partition
from the two-parameter family of random partitions, with parameters $\alpha$ and $\alpha + \theta$.